
International Journal of Artificial Intelligence and Applications for Smart Devices

Volume 5, No. 2, 2017, pp 1-6
http://dx.doi.org/10.21742/ijaiasd.2017.5.2.01




Korean 5W1H Extraction Using Rule-based and Machine Learning Methods



    Mei-ying Ren1 and Sin-jae Kang2
    1,2Dept. of Computer & Information Engineering

    Abstract

    In news text summarization, title-based and lead-text-based methods are commonly used. However, because of the nature of news articles, titles and lead texts usually do not contain much information or emphasize what the writer wants to stress. This paper instead extracts 5W1H elements, which are considered the core information in news articles, to summarize the text, so an extraction technique for 5W1H is needed. Previous studies used rule-based methods to extract 5W1H elements and used them to weight sentences for extractive summarization. However, it is difficult to construct rules that cover the variety of cases that occur. In this paper, filtering is first performed with simple, high-precision rules, and CRF-based labeling is then applied to extract 5W1H elements from news texts. To improve the results, named entity recognition and simple coreference resolution are also performed.
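
As a rough illustration of the two-stage idea described above, the following is a minimal sketch, not the authors' implementation. It assumes one possible reading of "filtering": a high-precision rule labels the tokens it is sure about, and the remaining tokens are turned into feature dictionaries of the kind a CRF sequence labeler commonly consumes. The rule (`DATE_RULE`), the label name `WHEN`, the feature set, and the example tokens are all hypothetical; the CRF training step itself is left out.

```python
import re

# Hypothetical high-precision rule (an assumption, not from the paper):
# explicit Korean date expressions such as "2017년" or "3월" are labeled WHEN.
DATE_RULE = re.compile(r"^\d{1,4}(년|월|일)$")


def rule_label(token):
    """Return a 5W1H label if the high-precision rule fires, else None."""
    if DATE_RULE.match(token):
        return "WHEN"
    return None


def token_features(tokens, i):
    """Build a feature dict for token i (hypothetical feature set)."""
    tok = tokens[i]
    return {
        "token": tok,
        "suffix1": tok[-1:],   # last character, often part of a particle
        "suffix2": tok[-2:],   # last two characters, e.g. "에서"
        "starts_with_digit": tok[:1].isdigit(),
        "prev": tokens[i - 1] if i > 0 else "<BOS>",
        "next": tokens[i + 1] if i < len(tokens) - 1 else "<EOS>",
    }


def prepare(tokens):
    """Apply the rule first; emit features for a later CRF labeling step."""
    labels = [rule_label(t) for t in tokens]
    feats = [token_features(tokens, i) for i in range(len(tokens))]
    return labels, feats


# Hypothetical pre-tokenized sentence: "A meeting was held in Seoul in March 2017."
tokens = ["2017년", "3월", "서울에서", "회의가", "열렸다"]
labels, feats = prepare(tokens)
```

Tokens whose rule label is `None` would then be passed to a trained CRF model, which is where named entity and coreference features could also be added.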


 
